


Institutional Archive of the Naval Postgraduate School 





Calhoun: The NPS Institutional Archive 
DSpace Repository 


Theses and Dissertations 1. Thesis and Dissertation Collection, all items 


1979 


Automatic factorization of generalized upper 
bounds in large scale optimization problems. 


Thomen, David Samuel 


Monterey, California. Naval Postgraduate School 


http://hdl.handle.net/10945/18624






Automatic Factorization of Generalized Upper Bounds 
in Large Scale Optimization Problems 


by 


David Samuel Thomen 


September 1979 


Thesis Advisor: Gerald G. Brown 





Approved for public release; distribution unlimited 


T191035 








UNCLASSIFIED
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE

4. TITLE (and Subtitle): Automatic Factorization of Generalized Upper Bounds in Large Scale
   Optimization Problems
5. TYPE OF REPORT & PERIOD COVERED: Master's Thesis; September 1979
7. AUTHOR(s): David Samuel Thomen
9. PERFORMING ORGANIZATION NAME AND ADDRESS: Naval Postgraduate School, Monterey, California 93940
11. CONTROLLING OFFICE NAME AND ADDRESS: Naval Postgraduate School, Monterey, California 93940
14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office): Naval Postgraduate
    School, Monterey, California 93940
15. SECURITY CLASS. (of this report): UNCLASSIFIED
16. DISTRIBUTION STATEMENT (of this Report): Approved for public release; distribution unlimited
19. KEY WORDS: Generalized upper bounds, GUB, exclusive row structure, ERS, signed identity
    factorization.







Approved for public release; distribution unlimited 


Automatic Factorization of Generalized Upper Bounds 
in Large Scale Optimization Problems 


by 


David Samuel Thomen 
Captain, United States Marine Corps 
B.A., Alma College, 1971 


Submitted in partial fulfillment of the 
requirements for the degree of 


MASTER OF SCIENCE IN OPERATIONS RESEARCH 


from the 


NAVAL POSTGRADUATE SCHOOL 
September 1979 







ABSTRACT 

To solve contemporary large scale linear, integer and mixed integer programming problems, 
it is often necessary to exploit intrinsic special structure in the model at hand. One commonly 
used technique is to identify and then to exploit in a basis factorization algorithm a generalized 
upper bound (GUB) structure (also called a static signed identity basis factorization). This 
report compares several existing methods for identifying GUB structure. Computer programs 
have been written to permit comparison of computational efficiency. The GUB programs have 
been incorporated in an existing optimization system of advanced design and have been tested 
on a variety of large scale real life optimization problems. The identification of GUB sets of 
maximum size is shown to be among the class of NP-complete problems; these problems are 
widely conjectured to be intractable in that no polynomial-time algorithm has been
demonstrated for solving them. All the methods discussed in this report are polynomial-time
heuristic algorithms that attempt to find, but do not guarantee, GUB sets of maximum size.
Bounds for the maximum size of GUB sets are developed, in order to evaluate the effectiveness
of the heuristic algorithms.
I. INTRODUCTION

Contemporary mathematical programming models are often so large that direct solution of
the associated linear programming (LP) problems with the classical simplex method is
prohibitively expensive, if not impossible in a practical sense. Most of these problems are
sparse, with relatively few non-zero coefficients, and usually possess very systematic
structure. These inherent structural characteristics can be exploited by specializations of
the simplex procedure. Various types of regularity are often described as, for instance, block
angular or staircase, terms chosen to describe the visual appearance of the non-zero
coefficients when the rows and columns of the problem are conveniently ordered. There are
profound economic, managerial and mathematical motives for this special structure, which are
not examined here.

Methods to exploit special model structure can be generally categorized as indirect (e.g.,
decomposition), where a solution to the original problem is achieved by dealing with related
models which are individually easier to solve, or as direct, where the original problem is
solved by a modified simplex algorithm.

Among the direct exploitation methods, the most frequently used technique is called basis
factorization [6], where the reflection of special problem structure appears and is used to good
benefit in the intermediate LP bases. Basis factorization can be dynamic, where the algorithm 
deals with each basis sequentially and/or independently in an attempt to extract as much 
specialized basis structure as possible, or static, where the algorithm depends upon certain types 
of special structure to be present in all bases. 

Static basis factorizations include simple upper bounds, generalized upper bounds (GUB),
and embedded network rows, among many others. Simple upper bounds are a set of rows for
which each row has only one non-zero coefficient. Generalized upper bounds are a set of rows
for which each column (restricted to those rows) has at most one non-zero coefficient. Network
rows are a set of rows for which each column (restricted to those rows) has at most two
non-zero coefficients of opposite sign.





Each of these factorizations permits the simplex algorithm to deal with the static subsets
of the rows (and columns) of all bases encountered with prior knowledge that they will satisfy
very restricted rules. Most of these methods work best when logic can be substituted for
arithmetic (as is the case with the coefficients ±1). For this reason, static factorizations
often restrict the special structure to possess only ±1 coefficients, or to be scaled to
produce an equivalent result.

The concept of generalized upper bounds was introduced in 1964 as the result of work by
Dantzig and Van Slyke [4]. The name is derived from analogy to the simple upper bound
structure. Graves and McBride [6] refer to Signed Identity Factorization as a term more
suggestive of the implied basis structure than GUB. Since their introduction, some form of GUB
has been implemented in many commercial LP systems. There is often confusion between the
mathematical characterization of GUB and these various, widely used implementations of GUB,
in that the latter often restrict the GUB set membership rules to permit uncomplicated simplex
logic. All of the methods reported here address the full generality of GUB sets but can be
modified as necessary to produce restricted GUB sets.

A group of rows collectively form a GUB set (or static identity basis factorization) if they 
do not conflict and can be transformed by simple row and column scaling into GUB constraints. 
Two rows are said to conflict if there exists at least one variable with non-zero coefficients in 
both rows. 

The details of how GUB structure may be exploited to reduce the computations of the
simplex algorithm are not discussed here; see [1, 4, 6, 10, 12]. The underlying concept is that
the GUB structure enables the simplex algorithm to manipulate the GUB rows implicitly, with
logic rather than floating point arithmetic, thus reducing the effective size and solution time
for the problem. The more rows one is able to place in the GUB set, the fewer rows one has to
carry explicitly through the simplex operations. If the original problem has m constraints (of
which p are GUB rows) and n variables, then at most an (m − p) x (m − p) submatrix of the basis
is needed for the explicit simplex operations. This contracted explicit basis means that many
of the calculations are replaced with logical operations, yielding faster results and less
numerical rounding error. For some problems GUB enables one to construct solutions that might
otherwise be beyond the capacity of available computers.





A further benefit from GUB is that, since at least one variable of every binding GUB
constraint must be in the basis, the GUB structure often suggests an excellent advanced initial
basis.

Many problem types have natural GUB structures embedded in them:

"a) Transportation problems (pure, bounded and capacitated networks)

b) Multi-product blending

c) Raw material and/or production resources allocation (forest management, machine
loading, plant scheduling)

d) Operations planning (combined production/inventory/distribution planning)

e) Resource assignment (i.e., freight cars, personnel)" [14]

To better illustrate this, Figure (1) contains a presentation of a transportation type problem.
Note that a GUB row set has been marked. For problems similar to this, a large GUB set can be
quickly found by visual inspection. Likewise, in a particular class of models, knowledge of the
model structure can lead to problem-independent specification of a proper set of GUB rows
(e.g., one can always GUB all the sinks of a pure network). But for many general models this
technique cannot be used, and due to the large size of the problems, visual inspection may be
limited to adjacent rows or patterns. This is very dependent on how the problem is written.
Most contemporary problems for which GUB factorization may be crucial are so large that an
explicit "picture" of the coefficient matrix is not even useful. (One notorious example has been
encountered with a "picture" measuring 10 by 300 meters!) It is therefore highly desirable to
have an automatic procedure whereby a GUB set can be efficiently identified.


Sources    1   1   1   1   0   0   0   0
           0   0   0   0   1   1   1   1
          -1   0   0   0   0   0   0   0   \
           0  -1   0   0  -1   0   0   0    |
Sinks      0   0  -1   0   0  -1   0   0    |  a set of GUB rows
           0   0   0  -1   0   0   0   0    |
           0   0   0   0   0   0  -1   0    |
           0   0   0   0   0   0   0  -1   /


Matrix representation of a transportation problem


Figure (1)





In large problems there exist a huge number of subsets of rows that satisfy the GUB
criteria. It is generally regarded that those subsets with more rows are "better" GUB sets since
they imply a more contracted explicit basis. The implied problem, then, is to find the maximum
GUB set.

Algorithms to find a maximum GUB row set for a problem do exist. These usually entail
enumeration schemes and cannot be guaranteed to be efficient in a practical sense. Conceivably,
2^m − m sets of rows might have to be searched before a maximum GUB structure is found. As
the problem size grows, the number of possible sets that need to be checked increases
exponentially. As will be shown later, the hope of finding an efficient algorithm to find the
maximum GUB set for any general problem is dim.

Therefore, researchers and practitioners have concentrated on constructing efficient
heuristic algorithms that attempt to identify, but do not guarantee, a maximum GUB set. A few of
these methods showing great promise have been reported, but they have not been tested at large
scale.

This report outlines several automatic heuristic GUB finding procedures that have been
developed and published in the recent literature. These procedures are tested on a suite of large
scale, real life optimization problems, and are modified to improve their behavior. Comparative
performance of the methods is given both in terms of the computational effort to identify a
GUB set and in terms of the quality of the GUB set achieved.

Identification of GUB sets of maximum row dimension is shown in Chapter VI to be
among the class of NP-complete problems. However, an easily computed upper bound on the
size of the maximum GUB set is developed and used to objectively evaluate the quality of
heuristic GUB algorithms, showing that very nearly maximum GUB sets are routinely achieved.
II. PROBLEM DEFINITION AND REPRESENTATIONS


The Linear Programming problem is defined as

$$
\begin{aligned}
\text{(L)}\quad \min\ \ & c'x \\
\text{s.t.}\ \ & \underline{r} \le Ax \le \overline{r} && \text{(ranged constraints)} \\
& \underline{b} \le x \le \overline{b} && \text{(simple bounds)}
\end{aligned}
$$

where $\underline{r}$ and $\overline{r}$ are m-vectors, x, c, $\underline{b}$ and $\overline{b}$
are n-vectors and A is an m x n matrix. The constraints are sometimes defined as equations, but
for the general case of GUB treated here constraints can be equations, inequalities or a mixture.
The immediate discussion will be directed at (L); below, the integer and mixed integer problems
are treated.

For identification of a GUB set of rows, only the matrix A is used, and since the actual
values of the non-zero elements of the matrix A are not required, the associated matrix K is
defined:

$$
K = (k_{ij}), \qquad k_{ij} =
\begin{cases} 0 & \text{if } a_{ij} = 0 \\ 1 & \text{otherwise.} \end{cases}
$$

As an example, consider the following linear programming problem. 


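$$
\begin{aligned}
\min\ \ & x_1 + 3x_2 - x_3 + 3x_4 + x_5 - 2x_6 \\
\text{s.t.}\ \ & x_1 + 2x_2 + x_3 = 4 \\
& 6x_1 + 2x_4 = 10 \\
& -3x_2 + 2.7x_3 = 7 \\
& 6x_4 + 4x_6 = 8 \\
& 2x_1 + x_5 = \text{[illegible]}
\end{aligned}
$$

The corresponding matrices A and K are:

$$
A = \begin{bmatrix}
1 & 2 & 1 & 0 & 0 & 0 \\
6 & 0 & 0 & 2 & 0 & 0 \\
0 & -3 & 2.7 & 0 & 0 & 0 \\
0 & 0 & 0 & 6 & 0 & 4 \\
2 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}
\qquad
K = \begin{bmatrix}
1 & 1 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}
$$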
There are several ways one can model the maximum GUB problem. Three approaches are
presented to aid in the understanding of the theoretical context of the heuristic methods
examined and to highlight the formal complexity of the original problem.

Two rows of K (or A) are said to conflict if there exists at least one column with non-zero 
coefficients in both rows. The GUB problem can be restated as that of finding a subset of the 
rows that do not conflict. 
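The conflict test can be sketched in a few lines of Python. This is only an illustration: the pattern `K` below is the non-zero pattern of the example problem as read from the text (rows 1 to 5, variables x1 to x6), and the helper names `rows_conflict`, `conflict_pairs` and `is_gub_set` are hypothetical, not taken from the programs described in this report.

```python
from itertools import combinations

# Assumed non-zero pattern K of the example problem (rows 1..5, columns x1..x6).
K = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 0],
]

def rows_conflict(K, i, k):
    """True if rows i and k (0-based) share a column with non-zeros in both."""
    return any(a and b for a, b in zip(K[i], K[k]))

def conflict_pairs(K):
    """All conflicting pairs of rows, reported with 1-based row numbers."""
    return [(i + 1, k + 1)
            for i, k in combinations(range(len(K)), 2)
            if rows_conflict(K, i, k)]

def is_gub_set(K, rows):
    """True if no two of the given 1-based rows conflict."""
    return not any(rows_conflict(K, i - 1, k - 1)
                   for i, k in combinations(rows, 2))

pairs = conflict_pairs(K)
```

Under these assumptions, `is_gub_set(K, [3, 4, 5])` holds, while any set containing both rows 1 and 2 fails the test because both rows use x1.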

A. GRAPH THEORY REPRESENTATION 

Consider the matrix K of the linear programming problem (L). A graphical representation
of this matrix can be constructed through the following mapping rule, f. Let each row of K be a
vertex of the graph. Should two rows of K conflict, the two corresponding vertices of the graph
are joined by an edge. This mapping retains all the necessary conflict information.

The graph associated with the example is presented in Figure (2). Note that it has five
vertices, one for each row. Since row 1 conflicts with rows 2, 3, and 5, edges connect vertex 1
to those vertices. The other edges represent the remaining conflicts. If two vertices, a and b, are
joined by an edge, e, then a and b are adjacent, and a (or b) is incident with e. Since vertices
2 and 3 are not adjacent, this indicates that the corresponding two rows in K do not conflict.





Figure 2 


This introduces the notion of independence. Given a graph G = (V, E), a subset V' ⊆ V is
said to be an independent set if no two of its elements are adjacent. It follows that if an
independent set of vertices can be found in G then the corresponding rows of the matrix K do not
conflict and thus define a GUB set. Conversely, a GUB set for K defines an independent set for
the graph G. It is also clear that an independent set for G is maximum if and only if the
corresponding GUB set for K is maximum.

In the example problem, the maximum independent set is {3, 4, 5}. These are also the
rows of the maximum GUB structure.
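For a graph this small the maximum independent set can be confirmed by exhaustive search. The sketch below is illustrative only: the edge list is the conflict structure of the example as assumed from the text, the function names are hypothetical, and the search examines subsets in decreasing size, so its cost grows exponentially with the number of rows.

```python
from itertools import combinations

# Assumed conflict edges of the example graph (1-based vertex = row number).
EDGES = {(1, 2), (1, 3), (1, 5), (2, 4), (2, 5)}

def is_independent(vertices, edges):
    """True if no two of the given vertices are joined by an edge."""
    return not any((a, b) in edges or (b, a) in edges
                   for a, b in combinations(vertices, 2))

def maximum_independent_sets(n, edges):
    """Return every independent set of largest size, found by enumeration."""
    for size in range(n, 0, -1):
        found = [set(c) for c in combinations(range(1, n + 1), size)
                 if is_independent(c, edges)]
        if found:
            return found
    return [set()]

best = maximum_independent_sets(5, EDGES)
```

With the assumed edges the search returns the single set {3, 4, 5}; enumeration of this kind is not practical at the problem sizes considered in this report.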

Consider the set $K_m$, the set of all K-type matrices having m rows. The above mapping
factors this set into a finite number of classes. Two matrices, $K_1$ and $K_2$, are said to
belong to the same class, C, if and only if each is mapped into the same graph, $G_c$.

[Diagram: the mapping f carries $K_1$ and $K_2$ to the same graph.]

Figure 3

Thus, an independent set of vertices of $G_c$ corresponds to a GUB row set for every matrix in
the class C.


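The incidence matrix N is defined as follows.

$$
N = (n_{ij}), \qquad n_{ij} =
\begin{cases} 1 & \text{if vertex } i \text{ is incident with edge } j \\ 0 & \text{otherwise} \end{cases}
$$

For the example problem N would be:

$$
N = \begin{array}{c|ccccc}
 & e_1 & e_2 & e_3 & e_4 & e_5 \\ \hline
v_1 & 1 & 1 & 1 & 0 & 0 \\
v_2 & 0 & 0 & 1 & 1 & 1 \\
v_3 & 1 & 0 & 0 & 0 & 0 \\
v_4 & 0 & 0 & 0 & 0 & 1 \\
v_5 & 0 & 1 & 0 & 1 & 0
\end{array}
$$

There exists one, and only one, incidence matrix for each graph of $G_m$, where $G_m$ is the set
of all graphs having m vertices.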
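As an illustration, the incidence matrix can be built directly from an edge list. The edge numbering below (e1 = (1,3), e2 = (1,5), e3 = (1,2), e4 = (2,5), e5 = (2,4)) is an assumption read off the example matrix N, and `incidence_matrix` is a hypothetical helper name.

```python
def incidence_matrix(n_vertices, edges):
    """Build N with N[i][j] = 1 when vertex i+1 is an endpoint of edge j."""
    N = [[0] * len(edges) for _ in range(n_vertices)]
    for j, (a, b) in enumerate(edges):
        N[a - 1][j] = 1
        N[b - 1][j] = 1
    return N

# Assumed edge numbering e1..e5 for the example graph.
EDGES = [(1, 3), (1, 5), (1, 2), (2, 5), (2, 4)]
N = incidence_matrix(5, EDGES)

# Every edge has exactly two endpoints, so every column sums to two.
column_sums = [sum(row[j] for row in N) for j in range(len(EDGES))]
```

Two rows of N share a non-zero column exactly when the corresponding vertices are adjacent, so N is itself a K-type matrix with the same conflict graph.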
Since the set of all N-type matrices with m rows is a subset of $K_m$, every class of $K_m$
contains one and only one incidence matrix. In general, for the GUB problem, every m row matrix
is equivalent to one of a finite number of incidence matrices. Superficially this may seem to be
a simplification. But as shown in Chapter VI the GUB problem on N is as difficult as the
independent set problem on G. The equivalent statements of the GUB problem do, however, offer
different views of the problem which are helpful in considering algorithms for and analysis of
the problem. [Note: In Garey and Johnson [5] it is shown that two other graph problems, the
"vertex cover" and the "clique" problem, are equivalent to the independence problem, and
hence the GUB problem. These problems do not seem to offer any additional insight for the
GUB problem.]
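B. CONFLICT MATRIX REPRESENTATION

The first method is developed around a conflict matrix. This is a square matrix of
dimension m, defined by:

$$
M = (m_{ij}), \qquad m_{ij} =
\begin{cases} 1 & \text{if row } i \text{ conflicts with row } j \text{ in (L)} \\ 0 & \text{otherwise.} \end{cases}
$$

For the example problem:

$$
M = \begin{bmatrix}
1 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 1
\end{bmatrix}
$$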


Note that this matrix is symmetric. The sum for any row (or column) indicates the number of 
other rows it is in conflict with, plus one. 

This sum is important in that it indicates for any particular row how many other rows 
would be subsequently excluded from a GUB set by its addition. 

The rows of a GUB structure can be rearranged to form an embedded identity matrix in M. 


Note that this is the case for rows 3, 4, and 5 in the example problem. 
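A minimal sketch of how the conflict matrix and its row sums might be computed follows. The pattern `K` is the assumed non-zero pattern of the example, and `conflict_matrix` is a hypothetical name; a row with at least one non-zero conflicts with itself, which gives the unit diagonal.

```python
# Assumed non-zero pattern K of the example problem (rows 1..5, columns x1..x6).
K = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 0],
]

def conflict_matrix(K):
    """M[i][k] = 1 when rows i and k have a non-zero in a common column."""
    m = len(K)
    return [[int(any(a and b for a, b in zip(K[i], K[k]))) for k in range(m)]
            for i in range(m)]

M = conflict_matrix(K)

# Row sum = number of conflicting rows plus one (the diagonal entry).
row_sums = [sum(row) for row in M]

# Rows 3, 4 and 5 (0-based 2, 3, 4) should pick out an embedded identity.
gub = [2, 3, 4]
embedded = [[M[i][k] for k in gub] for i in gub]
```

Here the row sums come out as 4, 4, 2, 2 and 3, so rows 3 and 4 would each exclude only one other row if added to a GUB set.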
C. VECTOR SPACE REPRESENTATION

The second heuristic approach can be modeled using vectors in an n-dimensional vector
space, where n is the number of variables in the problem (L). Consider each row of K as a vector
in this space, having unit length in those "dimensions" corresponding with its non-zero
coefficients. In the example problem, row 1 is represented by the following vector:

(1, 1, 1, 0, 0, 0)

R, the resultant vector from the sum of all the vectors of the rows of K, indicates the
number of conflicts, plus one, associated with each variable of (L). A hypercube in n-space
situated in the first orthant at the origin, with length 1 in all positive directions, denotes
the feasible GUB region. Should R extend beyond this region, then the set of rows corresponding
to the vectors determining R does not constitute a GUB structure.

A gradient vector can be calculated indicating the direction of the shortest distance to the
feasible region. It can be used to determine which row to remove from the set to obtain the
largest movement in the desired direction. When R falls within the feasible region, the set of
rows determining R constitutes a GUB set.
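The following sketch illustrates the idea on the example. It rests on several assumptions that are not stated in the text: the gradient is taken as the componentwise excess of R over the unit hypercube, a row's weight is its inner product with that gradient, the row with the largest weight is removed, and ties go to the lowest row number. The names `resultant`, `gradient` and `gradient_deletion` are hypothetical, and the published methods may differ in all of these details.

```python
# Assumed non-zero pattern K of the example problem (rows 1..5, columns x1..x6).
K = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 0],
]

def resultant(K, rows):
    """R: componentwise sum of the row vectors for the given 0-based rows."""
    return [sum(K[i][j] for i in rows) for j in range(len(K[0]))]

def gradient(R):
    """Assumed gradient: how far each component of R exceeds the unit cube."""
    return [max(r - 1, 0) for r in R]

def gradient_deletion(K):
    """Remove the heaviest row until R lies inside the unit hypercube."""
    rows = list(range(len(K)))
    while True:
        g = gradient(resultant(K, rows))
        if not any(g):                      # R is feasible: rows form a GUB set
            return [r + 1 for r in rows]
        weights = {i: sum(a * b for a, b in zip(K[i], g)) for i in rows}
        rows.remove(max(rows, key=lambda i: weights[i]))

R_all = resultant(K, range(len(K)))
gub_rows = gradient_deletion(K)
```

On the example, R for all five rows is (3, 2, 2, 2, 1, 1); rows 1 and 2 are removed in turn, leaving rows 3, 4 and 5.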
III. EARLIER LITERATURE

Two papers dealing with efficient GUB finding methods are worthy of special note. 

Brearley, Mitra and Williams [2] establish a very useful framework for the study of methods
for finding GUB structure, and provide an insightful discussion of these methods and a taxonomy
for their classification.

They define three sets consisting of the rows of the technological matrix A. The first set,
the eligible set, is made up of every row of A that is individually eligible to belong in the GUB
set. The structure set is a subset of the eligible set and includes all those rows currently
considered as members of the GUB set. The candidate set consists of those rows of the eligible
set that are candidates for inclusion (or re-inclusion) in the GUB set. Every one of the methods
examined in [2] is described in terms of manipulation of these sets.

Each method of building a GUB set employs one of two basic strategies. The type I (row
addition) strategy assumes initially that none of the rows belong to the GUB set. Then, based on
a particular type I criterion for inclusion, rows are removed from the candidate set and either
added to the structure set or dropped from further consideration. This process continues
until the candidate set is empty. The rows in the structure set form an admissible GUB
structure.

The type II (row deletion) strategy takes the opposite approach and is divided into two
phases. Methods of this type assume initially that all the eligible rows are elements of the
structure set. This assumption normally leads to an infeasible GUB set with many conflicting
rows. Based upon the particular type II decision rules, rows are removed from the structure set
and placed in the candidate set. The first phase of this strategy ends when a feasible structure
is obtained.

The second phase involves examining the removed rows in the candidate set. Those that do
not conflict with any of the members of the current structure set are taken from the candidate
set and re-included in the structure set. Those that do conflict are deleted from the candidate
set and dropped from further consideration. The second phase ends when the candidate set is
empty. At this point the rows of the structure set constitute an admissible GUB set.
Brearley, Mitra, and Williams examine over 18 different methods. These approaches differ
in the primary and secondary decision criteria for including (or removing) a row in the GUB
structure set. The heuristic decision rules examined are based on the following model entities
and combinations thereof:

Include or remove a row based upon:

a) the number of non-zero elements in the given row,

b) the number of rows in conflict with the given row,

c) the number of non-zero elements in rows that conflict with the given row,

d) the row's relative weight obtained by the inner product of a vector representation of
the row and a directional gradient.

Except for the last, these rules were tested with both strategies. As an example, I.1 (type
I strategy, method number 1) has as its primary decision rule for adding rows to the structure
set: choose a row from the candidate set with the minimum number of non-zero elements.
Method II.1 removes rows from the structure set based upon the maximum non-zero element
count for each row.
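The two strategies can be sketched with rule (a) as the only decision criterion. The code below is an illustration under stated assumptions, not the ALGOL implementation of [2]: `K` is the assumed non-zero pattern of the example in Chapter II, ties are broken by lowest row number, no secondary criteria are applied, and the function names are hypothetical.

```python
# Assumed non-zero pattern K of the example problem (rows 1..5, columns x1..x6).
K = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 0],
]

def conflict(K, i, k):
    return any(a and b for a, b in zip(K[i], K[k]))

def nnz(K, i):
    return sum(1 for v in K[i] if v)

def fits(K, i, structure):
    """True if row i conflicts with no row already in the structure set."""
    return all(not conflict(K, i, s) for s in structure)

def type_one_min_nonzeros(K):
    """Type I: add candidates in order of increasing non-zero count."""
    structure = []
    for i in sorted(range(len(K)), key=lambda r: (nnz(K, r), r)):
        if fits(K, i, structure):
            structure.append(i)          # otherwise the row is simply dropped
    return sorted(r + 1 for r in structure)

def type_two_max_nonzeros(K):
    """Type II: delete densest rows until feasible, then try re-inclusion."""
    structure, candidates = list(range(len(K))), []
    def feasible(rows):
        return all(not conflict(K, a, b)
                   for n, a in enumerate(rows) for b in rows[n + 1:])
    while not feasible(structure):       # phase 1: row deletion
        worst = max(structure, key=lambda r: nnz(K, r))
        structure.remove(worst)
        candidates.append(worst)
    for i in candidates:                 # phase 2: re-inclusion
        if fits(K, i, structure):
            structure.append(i)
    return sorted(r + 1 for r in structure)

set_one = type_one_min_nonzeros(K)
set_two = type_two_max_nonzeros(K)
```

With these tie-breaking assumptions the type I sketch stops at rows {2, 3} while the type II sketch reaches {3, 4, 5}, a small reminder that neither strategy guarantees a maximum GUB set.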

These methods were implemented with an ALGOL program run on an ICL 4130 computer.
Several linear programming models were run with this program and the results for 12 of them
were presented. These problems range in size from 12 rows up to 166 rows.

The results show that those methods using heuristic (d) above "consistently performed
very well" [2]. Similarly, those methods using heuristic (b) were found to perform nearly as
well as (d).

McBride [15] compares the directional gradient method (d) with an approach suggested,
but not tested, by Greenberg and Rarick [7]. The latter method uses the conflict matrix, as does
heuristic (b). However, rather than using the conflict matrix to determine conflict counts, it
focuses on finding a maximum embedded identity matrix within the conflict matrix by applying a
specialization of the preassigned pivot procedure (P^3) normally used for reinversion [8].
McBride's results indicate that heuristic (d) is significantly faster. However, neither method
consistently achieves a larger GUB set.
McBride also comments on the notion of a "good" GUB set. He finds merit in selecting a
set of GUB rows that minimizes the non-zero build-up in the representation of the inverse
transformation of the explicit basis during actual optimization. Results are also given for a
restricted GUB set selection that gives priority to equality constraints. Since equality
constraints are always binding in feasible solutions, the subset of the basis associated with
binding constraints, or kernel [6], is expected to have fewer explicit non-zero elements.

Based upon the results in these papers, and on independent computational experience
with automatic GUB factorization reported by Brown and Graves [3], the present research
was initially concentrated on those approaches utilizing the two most successful heuristics
(I.2, II.10 and variations).

The models studied in this report are of a larger scale and include mixed integer problems
as well as models for which prior GUB row sets have been manually specified.


Most of the notation and labels of [2] have been retained here for their clarity. 


IV. DETERMINATION OF THE ELIGIBLE SET 


The implementation of GUB in simplex algorithms usually admits only ±1 as non-zero
coefficients in the GUB rows. In linear programming a scaling of columns can make each
non-zero element in a GUB row ±1. For an integer or mixed integer programming problem, the
columns of matrix A that correspond to integer variables cannot be scaled without
destroying the integrality condition. Therefore, non-zero elements in columns corresponding to
integer variables can be modified only by row scaling. If it is impossible to obtain the necessary
±1 non-zero coefficients by row scaling and by column scaling of columns corresponding to
continuous valued variables, the row is not eligible for inclusion in a GUB set.

To provide the complete context of this research, the procedures examined for locating
a GUB set in a linear programming problem are designed to be incorporated as an automatic,
integral part of a contemporary optimizing system of advanced design.

Each method is implemented as a feature of the read routine (written to accept input in
the standard MPS format, as well as editing information indicating integer variables, scaling
and known prior GUB structure). Each method automatically examines the rows of the input
and specifies a GUB set. The appropriate rows and columns are then scaled as necessary to
obtain the proper GUB structure, and passed on to the optimizing portion of the system. (Note
that the editing information places conditions that must be satisfied for any achievable GUB set.)

In determining the set of eligible rows, the following factors have to be considered. 

a. Through the editing process, have some of the rows been dropped from the problem? 
If so, these rows are not eligible for inclusion in the GUB structure and are thus dropped from 
the set of eligible rows. 

b. Through the editing process, have any rows been predesignated to be in the GUB 
structure? (As previously mentioned, large segments of the constraints can often be selected for 
the GUB set either visually or by the implicit nature of the type of problem.) Any rows that 
conflict with these rows are not eligible for subsequent inclusion. 

c. All rows designated "nonconstrained" (N), which include the objective function, are
not eligible for inclusion in the GUB structure. All such rows, other than the objective
function, are subsequently handled independently of the optimization.

d. If there are any integer valued variables, an additional check is performed. A row in
the GUB set must eventually be capable of being scaled to ±1 non-zero coefficients. This is
achieved, if necessary, through a combination of row and column scaling. However, with integer
variables, column scaling is no longer advisable. Therefore any row whose non-zero elements in
integer columns are not all +1 or −1, and cannot be rendered ±1 in those positions through row
scaling alone, must be marked as ineligible for inclusion in the GUB structure.
Figure (4) gives the flow chart of how this procedure is implemented.

Once the above restrictions have been considered, the resulting set of eligible rows is then
available for search in order to construct the desired GUB structure.
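The integer-column check in item d can be illustrated as follows. This sketch is not the procedure of Figure (4); it simply assumes that a row passes when all of its non-zero entries in integer columns share one absolute value, since a single row scale factor can then bring them all to ±1. The function names and the tolerance are hypothetical.

```python
import math

def row_is_eligible(row, integer_cols, rel_tol=1e-9):
    """True if row scaling alone can make every integer-column entry +1 or -1.

    row          -- list of coefficients for one constraint
    integer_cols -- indices of columns belonging to integer variables
    """
    mags = [abs(row[j]) for j in integer_cols if row[j] != 0]
    if not mags:
        return True                      # no integer columns are involved
    return all(math.isclose(v, mags[0], rel_tol=rel_tol) for v in mags)

def row_scale_factor(row, integer_cols):
    """Scale factor that brings the integer-column entries to +1 or -1."""
    mags = [abs(row[j]) for j in integer_cols if row[j] != 0]
    return 1.0 / mags[0] if mags else 1.0

ok_row = [2.0, -2.0, 0.0, 5.0]           # integer entries 2 and -2: scale by 1/2
bad_row = [2.0, 3.0, 0.0, 1.0]           # integer entries 2 and 3: no single scale
INTEGER_COLS = [0, 1]
```

Entries in continuous columns (such as the 5.0 above) are left to column scaling, which is always possible within a GUB set because each column has at most one non-zero among the GUB rows.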
V. IMPLEMENTATION OF AUTOMATIC GUB HEURISTICS





A. CONFLICT METHODS 


The approaches I.2 and II.2 employ the notion of a conflict measure for each row.
Consider the conflict matrix, M, of the corresponding technological matrix A, for which a GUB
set is to be found. An individual element, $m_{ik}$, is 1 if row i and row k of the original
matrix have at least one column j such that $a_{ij} \ne 0$ and $a_{kj} \ne 0$. If the two rows
share no non-zero column then the corresponding $m_{ik}$ of the conflict matrix is 0. Summing
across a row of the conflict matrix thus gives the number of rows in conflict with a given row,
plus one. For a given row, this sum less one indicates exactly how many other rows would be
excluded from the GUB set by inclusion of this row. This second number is called the row's
deletion potential.

Method I.2 initially places all the eligible rows on a candidate list. From the candidate list, 
individual rows are selected and removed to be added to the structure. Other rows that are in 
conflict with the selected row are immediately removed from the candidate list and discarded. 

The heuristic selects those rows on the candidate list with the minimum deletion potential 
to be added to the structure first. The selection of rows for the structure and the discarding of 
conflicting rows continues until the candidate list is exhausted. The resulting structure forms a 
GUB set. 

A modification to the above heuristic is possible which breaks ties among rows sharing the 
minimum deletion potential by selecting the row having the most non-zero elements for in- 
clusion into the GUB structure. 

The program used to test this heuristic approach is adapted from an earlier version made 
available by Glenn Graves. A step by step description of the method is given below. 

Step 1. Identify Eligible Rows. Set $\beta_i = 1$ if row i is an eligible row, and equal to 0
otherwise.

Step 2. Determine Deletion Potential. Scan each eligible row i and increment $\beta_i$ by the
number of other eligible rows k, where $a_{ij}$ and $a_{kj}$ are both non-zero for at least one
column j. ($\beta_i$ is the deletion potential, plus one.)

Step 3. Stopping Condition. If all the $\beta_i = 0$, stop. Otherwise, go to the next step.

Step 4. Row Selection. Select row i having the minimum positive ("deletion
potential") $\beta_i$ and add it to the structure.

Step 5. Exclude Rows in Conflict with Selected Row. Locate the ($\beta_i - 1$) rows in conflict
with the selected row i. For each of these rows k, locate the ($\beta_k - 1$) rows that they
are in conflict with and decrement $\beta$ for those rows by one.

Step 6. Marking Selected and Excluded Rows Ineligible for Further Consideration.
Set $\beta_i$ and the $\beta_k$'s equal to zero. Go to step number 3.

Only rows with $\beta_i > 0$ are eligible. In step 1 $\beta_i$ is set to 1 for eligible rows. In
the next step the $\beta$'s for these rows are modified by each row's deletion potential. Assuming
there are still some eligible rows, the one with the smallest deletion potential is selected in
step 4 for inclusion in the structure. In the next step all the rows conflicting with the one
selected are identified for discard and the deletion potentials of the remaining rows are updated.
In the last step, both the selected row's weight and those of the discarded rows are set equal to
zero. When all rows have either been selected or discarded the $\beta$ array will be all 0's. At
this point the selected rows form a GUB structure.
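Steps 1 through 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the program adapted from Graves: rows are sets of non-zero column indices, conflicts are found by pairwise set intersection rather than through the threaded list used in the thesis, ties are broken by lowest row index, and the name `gub_by_row_addition` is hypothetical.

```python
def gub_by_row_addition(rows, eligible=None):
    """Greedy row addition by minimum deletion potential (sketch of method I.2)."""
    m = len(rows)
    if eligible is None:
        eligible = [True] * m
    # conflicts[i]: other eligible rows sharing at least one column with row i
    conflicts = [set() for _ in range(m)]
    for i in range(m):
        for k in range(i + 1, m):
            if eligible[i] and eligible[k] and rows[i] & rows[k]:
                conflicts[i].add(k)
                conflicts[k].add(i)
    # Steps 1-2: beta is the deletion potential plus one (0 for ineligible rows)
    beta = [1 + len(conflicts[i]) if eligible[i] else 0 for i in range(m)]
    structure = []
    while any(b > 0 for b in beta):                      # Step 3
        # Step 4: minimum positive beta; ties go to the lowest index (assumption)
        i = min((r for r in range(m) if beta[r] > 0), key=lambda r: beta[r])
        structure.append(i)
        # Step 5: rows still active that conflict with the selected row
        excluded = [k for k in conflicts[i] if beta[k] > 0]
        for k in excluded:
            for r in conflicts[k]:
                if beta[r] > 0:
                    beta[r] -= 1
        # Step 6: selected and excluded rows leave the candidate list
        beta[i] = 0
        for k in excluded:
            beta[k] = 0
    return structure


gub_rows = gub_by_row_addition([{0, 1}, {1, 2}, {3}, {2, 3}, {4}])
```

On this made-up five-row pattern the sketch selects rows 4, 0 and 2 in that order, which are pairwise disjoint.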

Method II.2 (row deletion) initially places all the eligible rows in the structure set. From
this set individual rows are selected and placed on the candidate list in order of maximum
deletion potential. During phase 2 Brearley, Mitra, and Williams drop all rows from further
consideration that conflict with the structure set and attempt to re-include remaining candidate
rows (that do not conflict with the structure set) in LIFO order. A modification of phase 2 is
used in this research which simply excludes from further consideration all conflicting rows,
re-includes any remaining candidate rows, and repeats phase 1, until no further non-conflicting
candidates remain.

B. GRADIENT METHODS

The second method (II.10) employs a heuristic method put forth by Senju and Toyoda
[16] for approximate solution of certain linear programming problems with 0, 1 variables. The
general problem that they address is that of choosing a most profitable combination (or
portfolio) of orders subject to resource constraints and an all-or-nothing (0-1) restriction on
the orders (i.e., an order is not allowed to be only partially filled).

This same format can be used to express the search for a maximal GUB structure in (L).
The rows of the technological matrix A are treated like the orders in the Senju and Toyoda 
model in that they are to be either included with or excluded from the GUB set. The objective 
is to obtain a maximum number of rows in the GUB structure while satisfying the stipulation 
that the GUB rows be disjoint. This last restriction can be expressed as a set of resource restric- 
tions in the sense of Senju and Toyoda. 

In mathematical terms, the GUB finding problem can be formulated as follows: 

$$ (S) \qquad \max\ Z = x_1 + x_2 + \cdots + x_m $$

subject to

$$ \sum_{i=1}^{m} k_{ij}\, x_i \le 1, \qquad j = 1, \ldots, n $$

$$ x_i = 0 \text{ or } 1 \quad \text{for all } i $$

where:
m: is the number of candidate rows in (L),
n: is the number of variables in (L),
$k_{ij}$: is the (i,j) element of the matrix K, which in turn is the 0, 1 matrix associated
with the matrix A,
$x_i$: is the variable which determines if row i is in the GUB set or not,
Z: is the objective function.
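For very small instances, (S) can be solved by plain enumeration, which gives a reference answer against which a heuristic can be checked. The Python sketch below is an illustration only: it takes K as a list of 0, 1 rows, its running time grows exponentially with m, and the name `max_gub_set_exhaustive` is hypothetical.

```python
from itertools import combinations


def max_gub_set_exhaustive(K):
    """Return one largest set of row indices whose column sums are all <= 1."""
    m = len(K)
    n = len(K[0]) if K else 0
    for size in range(m, 0, -1):                 # try the largest sets first
        for chosen in combinations(range(m), size):
            if all(sum(K[i][j] for i in chosen) <= 1 for j in range(n)):
                return list(chosen)
    return []


# Made-up 5 x 5 pattern matrix.
K_example = [
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
]
best = max_gub_set_exhaustive(K_example)
```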


Senju and Toyoda outline a heuristic approach for obtaining a near-optimal solution for
the problem they examine. Adapting their approach to the specialization (S) given above, a
type II strategy results, with all the rows initially being included in the GUB structure. Using
the constructive characterization of a vector space outlined earlier, consider each row of (S)
as a vector in n-space. [n is the number of variables in (L)]. A resultant vector R is determined
by the sum of all the included rows and, in general, extends beyond the feasible space denoted
by the unit hyper cube. A gradient vector is calculated from this infeasible point in the
direction of the shortest distance to the feasible region. In Brearley, Mitra and Williams [2]
this vector is labeled oll. An inner product of this gradient with each of the row vectors results
in a relative weight for each row. These weights, which are stored in a vector labeled $\nu$, can
be viewed as indicating the relative contribution that the removal of the corresponding row would
have towards obtaining a feasible structure.

Rows are removed from the structure set according to their relative weight, the largest
weight being removed first. This process is continued until a feasible set of GUB rows has been
obtained. (The gradient vector is not recomputed as the method proceeds.)

Next, a phase 2 procedure is implemented which examines each of the initially removed 
rows to see if any can be re-included into the structure without violating the bounds of the unit 
hyper cube. Upon completion of phase 2, the selected rows constitute a GUB set. 

A variation on the above procedure recalculates the shortest distance to the feasible
region after the removal of each row. With the new gradient, a new set of relative weights for the
remaining rows is then calculated and used, if necessary, to determine which of the subsequent
rows will be removed. This method is named II.9.

Another modification is possible if two rows are found with equal weights. As a tie- 
breaking rule, the row found to have the least number of non-zero coefficients is discarded first. 

A step by step outline of the heuristic approach follows: 

Phase 1: Deletion of Infeasible Rows

Step 0. Initialize Sets. Add all eligible rows to the structure set. The candidate set is
empty.

Step 1. Determining the Vector R. For each column j, define $\rho_j$ as the number of
rows in the structure set having non-zero elements in column j.

Step 2. Determining Relative Weight of each Row. For each row i, define $\nu_i$ as the
sum of the ($\rho_j - 1$) of every column j for which $a_{ij} \neq 0$.

Step 3. Feasibility Condition. If, for every column, $\rho_j \le 1$, then go to step 6; else find
a column j such that $\rho_j > 1$.

Step 4. Determining Row for Exclusion. Examine the rows in the structure having
non-zero elements in column j. Select the row i with the largest $\nu_i$.

Step 5. Removal of Selected Row. Remove row i from the structure set, decrementing
$\rho_j$ by one for every column j with $a_{ij} \neq 0$. Add row i to the candidate set and return to
step 3.

Phase 2: Improving on Feasible GUB set Found by Re-including Excluded Rows

Step 6. Eliminate Rows in Candidate Set that Conflict with the Feasible Set. For every
row i of the candidate set that has at least one $a_{ij} \neq 0$ in a column with $\rho_j \ge 1$, remove
that row from the candidate set.

Step 7. Re-inclusion of Row. If any rows remain in the candidate set, then find row i
having the smallest $\nu_i$. Remove row i from the candidate set and re-include it in the
structure set. Increment $\rho_j$ by one for every column j where $a_{ij} \neq 0$.

Step 8. Stopping Condition. If the candidate set is empty, stop; else go to step 6.


In step one, the vector $\rho$ is calculated as the sum of all m individual row vectors.
Step two calculates the relative weights that result from the inner product of the gradient vector
with each of the row vectors. These are stored in the array $\nu$. The next step examines $\rho$ to
see if the vector is within the feasible region. If not, a row with the largest relative weight is
removed from the structure set and the $\rho$ vector is updated to reflect the sum of the row
vectors remaining in the structure set.

Once a feasible structure has been obtained, the candidate set (which consists of those 
rows initially removed) is scanned in step 6. Any of those rows found to still be in conflict with 
the rows of the structure set are discarded. Among those rows which remain, that with the 
smallest relative weight is re-included in the structure. This cycle of discarding and re-inclu- 
sion is continued until the candidate set has been emptied. The resulting rows of the structure 
set constitute a feasible GUB set. 
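The two phases described above can be sketched in Python as follows. This is an illustrative reading of steps 0 through 8 with the gradient held fixed (the II.10 behaviour), not the thesis code. Assumptions: rows are sets of non-zero column indices, all rows passed in are eligible, the first violated column found is used in step 3, ties are broken by lowest row index, and the name `gub_by_gradient` is hypothetical.

```python
def gub_by_gradient(rows, n_cols):
    """Row deletion with fixed relative weights, then re-inclusion (sketch)."""
    m = len(rows)
    in_structure = [True] * m                    # Step 0
    candidates = []
    rho = [0] * n_cols                           # Step 1: column counts
    for cols in rows:
        for j in cols:
            rho[j] += 1
    # Step 2: relative weights, computed once and never refreshed
    nu = [sum(rho[j] - 1 for j in cols) for cols in rows]

    # Phase 1
    while True:
        j = next((c for c in range(n_cols) if rho[c] > 1), None)    # Step 3
        if j is None:
            break
        i = max((r for r in range(m) if in_structure[r] and j in rows[r]),
                key=lambda r: nu[r])                                # Step 4
        in_structure[i] = False                                     # Step 5
        for c in rows[i]:
            rho[c] -= 1
        candidates.append(i)

    # Phase 2
    while candidates:
        # Step 6: keep only candidates that touch no occupied column
        candidates = [r for r in candidates if all(rho[c] == 0 for c in rows[r])]
        if not candidates:
            break
        i = min(candidates, key=lambda r: nu[r])                    # Step 7
        candidates.remove(i)
        in_structure[i] = True
        for c in rows[i]:
            rho[c] += 1
    return [r for r in range(m) if in_structure[r]]                 # Step 8


gub_rows = gub_by_gradient([{0, 1}, {1, 2}, {3}, {2, 3}, {4}], n_cols=5)
```

On this made-up pattern phase 1 removes rows 1 and 3, and phase 2 discards both because each still conflicts with the structure set.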

To modify the algorithm in order to compute a new gradient vector after the removal of
each row in phase 1, step 5 is changed as follows:

Step 5.* Removal of Selected Row. Remove row i from the structure, decrementing
$\rho_j$ by one for every column j such that $a_{ij} \neq 0$. Locate each row k that is in conflict
with row i. Decrement $\nu_k$ by the number of conflicts between the two rows. Add row i to the
candidate set and return to step 3.
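The update in Step 5.* can be checked against a full recomputation with a small Python sketch. It is illustrative only: rows are sets of non-zero column indices, "the number of conflicts between the two rows" is read here as the number of columns the two rows share, and the names `remove_row_and_update` and `weights_from_scratch` are hypothetical.

```python
def weights_from_scratch(rows, in_structure, n_cols):
    """Recompute rho from the rows in the structure, then every nu."""
    rho = [0] * n_cols
    for r, cols in enumerate(rows):
        if in_structure[r]:
            for j in cols:
                rho[j] += 1
    nu = [sum(rho[j] - 1 for j in cols) for cols in rows]
    return rho, nu


def remove_row_and_update(i, rows, in_structure, rho, nu):
    """Step 5.*: remove row i, updating rho and nu in place."""
    in_structure[i] = False
    for j in rows[i]:
        rho[j] -= 1
    for k, cols in enumerate(rows):
        if k != i:
            # one unit per shared column; zero for rows not in conflict with i
            nu[k] -= len(rows[i] & cols)


rows = [{0, 1}, {1, 2}, {3}, {2, 3}, {4}]
in_structure = [True] * len(rows)
rho, nu = weights_from_scratch(rows, in_structure, 5)
remove_row_and_update(1, rows, in_structure, rho, nu)
```

For the rows still in the structure set, the incrementally updated weights agree with those obtained by recomputing everything, which is why the cheaper update suffices.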

Now when a row is removed from the structure set, the $\nu_i$ contain the new relative
weights equal to the inner product between the vector for row i and the new gradient.

These two basic methods have been implemented as integral modules of a large scale
optimization system. Therefore, explicit conflict matrices are not built. (To have done so would
have consumed too much computer time and space.) Instead, all the information is stored in the
vectors $\beta$, $\rho$, and $\nu$.

Logical flags associated with each row indicate whether it is eligible, and whether it is in 
the candidate set or in the structure set. 

As mentioned previously, the problem data is read in MPS format and expressed internally
in terms of only the non-zero elements. This input is stored in a doubly linked list having both a
row and a column thread. Thus, along with any non-zero coefficient $a_{ij}$, the location of adjacent
non-zero elements in both the row i and column j are also immediately available. This crucial
feature permits efficient row access for various operations (e.g., to locate all rows that conflict
with a given row at a particular column).
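A storage scheme of this kind might look like the following Python sketch. It is an assumption-laden illustration, not the FORTRAN data structure of the thesis: each non-zero is held once in parallel arrays and carries one forward link along its row and one along its column, and the class name `ThreadedMatrix` and its methods are hypothetical.

```python
class ThreadedMatrix:
    """Non-zeros stored once, each linked into a row thread and a column thread."""

    def __init__(self, n_rows, n_cols):
        self.row_of, self.col_of, self.value = [], [], []
        self.next_in_row, self.next_in_col = [], []
        self.row_head = [-1] * n_rows     # -1 marks the end of a thread
        self.col_head = [-1] * n_cols

    def add(self, i, j, a):
        """Store coefficient a at (i, j) and link it at the head of both threads."""
        e = len(self.value)
        self.row_of.append(i)
        self.col_of.append(j)
        self.value.append(a)
        self.next_in_row.append(self.row_head[i])
        self.row_head[i] = e
        self.next_in_col.append(self.col_head[j])
        self.col_head[j] = e

    def columns_of_row(self, i):
        e = self.row_head[i]
        while e != -1:
            yield self.col_of[e]
            e = self.next_in_row[e]

    def rows_of_column(self, j):
        e = self.col_head[j]
        while e != -1:
            yield self.row_of[e]
            e = self.next_in_col[e]

    def conflicting_rows(self, i):
        """Rows other than i sharing at least one non-zero column with row i."""
        found = set()
        for j in self.columns_of_row(i):
            for k in self.rows_of_column(j):
                if k != i:
                    found.add(k)
        return found


matrix = ThreadedMatrix(5, 5)
for i, cols in enumerate([{0, 1}, {1, 2}, {3}, {2, 3}, {4}]):
    for j in cols:
        matrix.add(i, j, 1.0)
```

With both threads available, the rows in conflict with a given row are found by visiting only the non-zeros of the columns that row touches, so no m by m conflict matrix is ever formed.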

C. COMPUTATIONAL RESULTS 

The heuristic methods have been tested on fifteen real-life problems that vary in size from
92 constraints to 4,648 constraints. A description of each of the problems is given in figure (5).
As can be seen, four of the problems are mixed integer and two are pure integer. The
experiments have been conducted using the FORTRAN H compiler on an IBM 360/67 computer at
the W.R. Church computer center of the Naval Postgraduate School. All execution times
reported are expressed in actual CPU seconds, accurate to the precision displayed.

The results of these experiments are given in Appendix A. The first two columns give the
rows and non-zero column elements, respectively, of the GUB structures found. The time given
in column three is the time required to locate the GUB set once the set of eligible rows has been
determined. The final columns give additional information relating to the two versions of the
gradient methods examined and represent total time in phase 1 and the number of rows
re-included in the GUB structure during phase 2.

Problem     Number        Number        Integer     Non-Zeros
            of rows       of columns    Columns
VANN        92            1,324         1,324       2,648
NETTING     103           247           103         494
AIRLP       [illegible]   3,040         0           6,023
COAL        171           3,753         0           7,506
TRUCK       239           4,752         4,752       30,074
CUPS        415           619           145         1,341
FERT        606           9,024         0           40,484
PIES        663           2,923         0           13,288
PAD         695           2,934         0           13,459
ELEC        785           2,800         0           8,462
GAS         799           5,536         0           27,474
FOAM        1,017         4,020         42          17,187
LANG        1,236         1,425         0           22,028
JCAP        2,487         3,849         560         9,510
ODSAS       4,648         4,683         0           30,520

Figure 5


As with the earlier work cited, the Senju and Toyoda method has been found to be consis- 
tently the fastest. In general this holds true for both method II.9 and II.10. II.9, which updates 
the gradient after each row is removed, takes longer in phase 1 than its counterpart. However, 
it so selectively deletes the rows, that few if any rows are ever added back into the structure 
during phase 2. This suggests the possibility of implementing II.9 as only a one phase method. 

All methods are robust in that they find large GUB sets. The conflict approaches generally 
find a larger number of variables with non-zero coefficients in the GUB rows. However, this 
approach definitely becomes inefficient when larger problems are analyzed, regardless of the re- 
lative size of the GUB structure in the problem. 

There is some discrepancy between these results and those published earlier [2], especially
with regard to the times of the other methods compared to II.10. The wide discrepancy
between II.9 and II.10 has not been observed in the current research. It is hypothesized that this
is due partially to differences in implementation of the various approaches and partially to
problem size and structure variations between these studies.

VI. PROBLEM COMPLEXITY





The complexity of a problem is said to be polynomial if an algorithm exists for which the
fundamental operations are limited by a polynomial function of intrinsic problem dimensions.
Such an algorithm would be called a polynomial time or good algorithm. The class of all
problems for which such algorithms exist is denoted (P). If an algorithm is not polynomial time,
then it is defined to be an exponential time algorithm. The disadvantage of an exponential
algorithm is seen in the explosive growth of the maximum solution time relative to a good
algorithm as the dimensions of the problem increase [13].

A problem x is said to be reducible to a problem y if each good algorithm for solving y
can be used to produce in polynomial time a good algorithm for solving x [11]. Note that this
does not necessarily require that a good algorithm for x and y actually exist. This requires only
that if one exists for y, then one also exists for x.

An intractable problem is one for which it is known that no polynomial time algorithm
exists. In between this class of problem, and the class P, is a vast number of problems whose
status is uncertain. Among these is a class of nondeterministic polynomial-time problems (NP)
for which a polynomial time algorithm can be shown to exist that can verify a guessed solution,
but for which the existence of a (deterministic) polynomial time algorithm to actually solve a
problem has not yet been demonstrated.

If every problem of the class NP is reducible to the problem y, then y is said to be NP- 
hard. In addition, if y itself belongs to NP, then y is NP-complete [5, 11]. 

The following problem is known in the literature as the independent set decision
problem (ISD). It belongs to the set of NP-complete problems.

(ISD) Given a graph G = (V,E) and an integer t, does G contain an independent set of size t
or more?

The GUB decision problem GUBD can be defined as follows:

(GUBD) Given an m x n 0,1 matrix K and an integer p, does K contain a set of p or more
rows $i_1, i_2, \ldots, i_q$ such that

$$ (*) \qquad \sum_{l=1}^{q} k_{i_l j} \le 1 \quad \text{for every column } j; \qquad q \ge p. $$

Given an instance of the ISD problem, the incidence matrix N can be constructed. This matrix
along with the integer t is an instance of the GUBD problem. The following theorem proves the
correctness of this reduction:


Theorem: The incidence matrix N has t rows satisfying (*) if and only if there
are t vertices in G that are independent.

Proof: a) Assume there exist t rows of N that satisfy (*). They correspond
to vertices $v_{i_1}, v_{i_2}, \ldots, v_{i_t}$ in G. If any two of these vertices are
adjacent, then

$$ \sum_{l=1}^{t} n_{i_l j} = 2 $$

where $n_{ij}$ is the (i,j) element of N and j is the column in N that corresponds to the edge
connecting the two vertices. This is a violation of the assumption, hence the t vertices in G are
not connected to one another.

b) Assume there exist t vertices $v_{i_1}, v_{i_2}, \ldots, v_{i_t}$ in G that are independent.
Since no two are adjacent, the corresponding rows in N satisfy (*). Q.E.D. [18]

Since the ISD problem, a problem known to be NP-complete, is reducible to the GUBD 
problem, it follows that the GUBD problem itself is NP-complete. (It is clear that the reduction 
is polynomial time and it is also clear that GUBD is in NP.) 
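The reduction can be exercised on a small graph with the Python sketch below. It is only an illustration: vertices are numbered from 0, edges are given as pairs (an assumed input format), and the names `incidence_matrix`, `satisfies_star` and `is_independent` are hypothetical. The construction of N is clearly polynomial, since it writes one entry per vertex and edge.

```python
from itertools import combinations


def incidence_matrix(n_vertices, edges):
    """Vertex-edge incidence matrix N: one row per vertex, one column per edge."""
    return [[1 if v in e else 0 for e in edges] for v in range(n_vertices)]


def satisfies_star(N, chosen):
    """Condition (*): the chosen rows sum to at most 1 in every column."""
    n_cols = len(N[0]) if N else 0
    return all(sum(N[i][j] for i in chosen) <= 1 for j in range(n_cols))


def is_independent(edges, chosen):
    """True when no edge has both end points among the chosen vertices."""
    picked = set(chosen)
    return not any(u in picked and v in picked for u, v in edges)


# A cycle on five vertices, used as a made-up test graph.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
N = incidence_matrix(5, cycle_edges)
agree = all(
    satisfies_star(N, chosen) == is_independent(cycle_edges, chosen)
    for size in range(6)
    for chosen in combinations(range(5), size)
)
```

Here `agree` compares the two conditions on every subset of vertices of the cycle, which is the statement of the theorem restricted to one small instance.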

The related problems of finding a maximum independent set and a maximum GUB set are
not in NP; however, they are NP-hard. It is therefore unlikely that a polynomial-time algorithm
will be found for these problems. Only exponential-time algorithms are presently available. The
above analysis of GUB algorithms has only indicated the worst case bound. No conclusions are
made about the expected (i.e., average) performance of an algorithm. In other words, the
possibility of the existence of an algorithm with a good expected performance time, but having
an exponential worst case bound, has not been ruled out.

VII. AN UPPER BOUND FOR THE SIZE OF MAXIMUM GUB SET





The intrinsic difficulty of identifying a maximum GUB row set has been shown to be 
exponential, making this task essentially impossible for problems of the scale at hand. However, 
the efficient heuristic procedures have been shown to provide very large GUB sets, whose size 
appears to be relatively stable for each problem regardless of the particular method applied. 
This suggests that these large GUB sets may be, in fact, very nearly maximum, although there 
is no practical way to verify this directly. 

Although the problem of determining the size of the maximum GUB set is also NP-hard, 
it is possible to develop an easily computable upper bound on the maximum GUB set size. This 
bound can then be used to objectively evaluate the quality of the GUB sets produced by heur- 
istic algorithms. 

It is clear that the number of rows of a GUB set can be no greater than the number of rows
in the problem. Also any one row by itself can form a GUB set. But these bounds are of little
practical use when considering the problem of identifying a maximum GUB set. Utilizing
information that is already available in the heuristic procedure, it is possible to construct in
polynomial time an upper bound on the size of the maximum GUB set. (It is also possible to
construct a lower bound on the size of the maximum GUB set, but that topic is not pursued in
this report.)

For the purpose of developing a better bound, the incidence matrix representation (N) of
the problem is used. Let $s_i$ be the number of 1's in row i. Note that $s_i$ is the number of edges
incident to vertex i in G. Also note that $s_i = \beta_i - 1$. The number of columns in N represents
the number of distinct conflicts that exist between the rows of the original problem. This
number is denoted as c, and can be found by the following formula.

$$ c = \frac{1}{2} \sum_{i=1}^{m} s_i $$


If c is greater than 0, all the rows of N cannot simultaneously belong to a GUB set, which
implies the cardinality of the GUB set is less than m. As c becomes larger, the following
argument shows that the upper bound of the maximum GUB set decreases.

If c is positive, but strictly less than m, it is possible for all the conflicts to involve one
row. Removal of that row would then leave $m - 1$ rows that form a GUB set. Thus for c in
the range from 1 to $m - 1$, an upper bound on the maximum GUB is $m - 1$. Since one row can
conflict with at most $m - 1$ other rows, once $c \ge m$, at least two rows have to be removed to
form a GUB set. For $m \le c \le (m-1) + (m-2)$ it is possible to construct an incidence
matrix such that all the conflicts are between a pair of rows and the remaining set of rows.
Removal of the pair would result in a GUB set of $m - 2$ rows. This constructive argument
continues until $c = m(m-1)/2$, the maximum number for c. This could occur when each
row conflicts with every other row. At that point, the max maximum GUB = min maximum GUB
= 1.
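The staircase described above can also be summarised by a single counting inequality. The following is offered as a sketch that is consistent with the constructive argument, not as the author's own derivation; the symbol g, standing for the size of a conflict-free set of rows, is introduced here only for illustration. Among m rows there are $m(m-1)/2$ pairs, and none of the $g(g-1)/2$ pairs inside a GUB set of size g may be in conflict, so:

```latex
c \;\le\; \frac{m(m-1)}{2} - \frac{g(g-1)}{2}
\quad\Longrightarrow\quad
g(g-1) \;\le\; m(m-1) - 2c
\quad\Longrightarrow\quad
g \;\le\; \tfrac{1}{2} + \sqrt{\tfrac{1}{4} + m(m-1) - 2c}
```

For m = 5 this gives g at most 4 for c from 1 to 4, at most 3 for c from 5 to 7, at most 2 for c equal to 8 or 9, and 1 for c = 10, matching the ranges in the argument.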

A graph of an upper bound on the maximal GUB for a 5 row problem such as the example
problem is given below:

[Graph: upper bound on the maximum GUB set versus the conflict count c, for m = 5]

Figure 6


For the example problem, m = 5 and c = 5. From the above graph the upper bound on the max- 
imum GUB for that problem is 3. Since a GUB set containing three rows has already been iden- 
tified, that set is a maximum set. 

In general, for any problem with an m x c incidence matrix, the largest maximum GUB set
that can be obtained is:

$$ u_1 = \left\lfloor .5 + \sqrt{.25 + m(m-1) - 2c} \right\rfloor $$

The above bound is problem-independent and a sharp bound in that matrices with a GUB
set the size of the bounding value can be constructed.

With additional information about a specific problem a better bound can be constructed.
Since $s_i$ is the number of other rows that conflict with row i, removing row i from the set of
rows reduces the number of conflicts, c, by $s_i$. Let IMAX denote $\max_i s_i$. Since IMAX is the
largest row conflict count, c can be reduced by no more than IMAX with the removal of each
row. The minimum number of rows that would have to be removed to reduce the number of
row conflicts to 0 is $\lceil c/\mathrm{IMAX} \rceil$. Therefore, given m, c and IMAX, the bound can be
improved to

$$ u_2 = \begin{cases} m - \lceil c/y \rceil & c \le (m-y)\,y \\ \left\lfloor .5 + \sqrt{.25 + y(2m-y-1) - 2c} \right\rfloor & c > (m-y)\,y \end{cases} $$

where y = IMAX.

In order to determine IMAX, the entire $\beta$ vector must be examined.
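Both bounds are cheap to evaluate. The Python sketch below is an illustration under stated assumptions: it uses integer square roots in place of the floating point form, relying on the identity that the floor of .5 + sqrt(.25 + x) equals (1 + isqrt(1 + 4x)) // 2 for non-negative integer x, and the function names `u1` and `u2` are hypothetical.

```python
from math import isqrt


def u1(m, c):
    """Problem-independent bound from m eligible rows and c distinct conflicts."""
    x = m * (m - 1) - 2 * c
    if x < 0:
        raise ValueError("c cannot exceed m(m-1)/2")
    return (1 + isqrt(1 + 4 * x)) // 2


def u2(m, c, imax):
    """Bound that also uses IMAX, the largest row conflict count."""
    if c == 0:
        return m
    y = imax
    if c <= (m - y) * y:
        return m - -(-c // y)          # m minus the ceiling of c / y
    x = y * (2 * m - y - 1) - 2 * c
    return (1 + isqrt(1 + 4 * x)) // 2


# The 5 row example: m = 5 and c = 5 give an upper bound of 3.
example_bound = u1(5, 5)
```

As a consistency check, the values m = 662, c = 4116 and IMAX = 21 listed for PIES in Appendix A give 655 and 466, and m = 605, c = 16455 and IMAX = 580 listed for FERT give 576 for the second bound, in agreement with the U1 and U2 entries printed there.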

A third, even better bound can be obtained with additional information on the frequency
of the conflict counts from 1 to IMAX. The procedure is the same as above, in that when a
row is removed with IMAX conflict count, c decreases by IMAX. However, instead of
continuing to decrease c by IMAX, it is decreased by the next largest $s_i$. This procedure
continues until, once again, c becomes zero. This bound is named $u_3$.
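One possible reading of this procedure is sketched in Python below. It is an interpretation of the paragraph above rather than the thesis implementation: the conflict counts are taken in decreasing order and subtracted from c until nothing remains, and the number of counts used is the least number of rows that must be removed. The name `u3` and the sample counts are hypothetical.

```python
def u3(m, c, s):
    """Bound from the full list s of row conflict counts (sketch).

    m is the number of eligible rows, c the number of distinct conflicts,
    and s[i] the number of other rows in conflict with row i.
    """
    removed, remaining = 0, c
    for count in sorted(s, reverse=True):   # largest conflict counts first
        if remaining <= 0:
            break
        remaining -= count                  # a row removes at most s[i] conflicts
        removed += 1
    return m - removed


# Made-up counts for five rows with five distinct conflicts (the counts sum to 2c).
bound = u3(5, 5, [3, 2, 2, 2, 1])
```

Because every subtraction after the first uses a count no larger than IMAX, this count of removed rows is never smaller than the ceiling of c / IMAX.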

Each tighter bound requires more information about the particular problem. However, 
all the information is readily available since it is generated by the heuristics using the conflict 
measure. 

The bounds developed can be used to objectively evaluate the size of a GUB set found
by heuristic methods. In two problems examined, VANN and AIRLP, the number of rows
in the GUB set equals an upper bound on the maximal GUB set for the problem. Therefore,
for those problems, the heuristic methods are verified to have located maximum GUB sets.

Manual specification of a GUB set from visual inspection can utilize these bounds as an
excellent measure of the maximum additional rows to be found. This information is also an aid
in deciding whether to subject the problem to additional automatic searching for GUBs.

VIII. EXTENSIONS

The upper bounds developed in this report vary from a problem independent bound, to 
tighter problem dependent bounds. It is speculated that additional information can be easily 
extracted from the actual conflict structure of the problems that can be used to tighten the 
existing bounds even further. In addition, lower bounds can be developed by similar methods. 

Another area that warrants further study is the special structure of the incidence matrix 
representation of the original problem. It is noted that for an incidence matrix, N, the relative 
weights generated for each row are (except for a constant) identical for both methods studied. 
This implies that for a matrix N, and the same strategy (i.e., II), the two heuristics would 
identify the same GUB set. 

Finally, research is continuing with automatic location of network row structure. As one
illustration of an immediate generalization of the GUB results, a GUB set for a problem can
be identified and then another GUB set of an eligible subset of remaining rows can be found.
Thus, a bi-partite network row factorization can be achieved (e.g., transportation or assignment
rows). This problem is being further examined by Wright [17].

IX. CONCLUSIONS

The computational benefits of a large GUB set for an LP problem are widely recognized. 

The need for an algorithm that can extract a maximum GUB from contemporary large scale 
linear programming models is therefore apparent. This report shows that the identification of a 
maximum GUB set is a difficult problem, essentially as hard as many other widely known 
difficult problems. 

An alternate approach is the use of a heuristic. This report has examined two promising
methods (with two versions of each) with application to a series of real life, large scale models.

All versions are robust in their ability to find large GUB sets of rows. However, the two versions
(II.9 and II.10) that use the Senju and Toyoda method are consistently the fastest. These two
methods are essentially equal in their efficiency and effectiveness. Since version II.9 (which
recalculates the gradient after the removal of each row) so selectively removes the rows during
the first phase that few if any rows are re-included in the GUB set during the second phase, it
suggests the possibility of implementing this version as only a one phase (row deletion) method.

The representation of an infinite number of m row matrices by a finite number of inci- 
dence matrices offers a powerful and concise way of examining the GUB problem. Under this 
representation, both basic heuristic methods investigated assign (within a constant) the same 
relative selection weights to each row. 

Finally, the ability to define upper bounds on the maximum size of the GUB set gives a
new powerful tool in this area. It enables one to evaluate the quality of GUB sets found even in
very large problems, for which the algorithmic identification of the maximum GUB set is
probably impossible in general. In some cases, verification of a heuristically achieved maximum
GUB set is now possible. Further, the bounds developed may be enhanced in future
research, and may be applicable to related problems of equivalent complexity.

Appendix A

This appendix contains the computational results for the fifteen linear, mixed integer and
integer models examined. The experiments have been conducted using the FORTRAN H
compiler on an IBM 360/67 computer at the W. R. Church computer center of the Naval
Postgraduate School. All execution times reported are expressed in actual CPU seconds, accurate
to the precision displayed.

For clarity, the following terms are defined:

Eligible rows: The number of rows of the model that were initially eligible for inclusion
in a set of GUB rows.
Conflict count: The number of columns of the incidence matrix for the problem.
Conflict density: The ratio of the conflict count to the maximum conflict count
for that problem size. [i.e., (m) (m - 1)/2]


Time to find Elig: The time in CPU seconds to determine the set of eligible rows.

Problem: COAL                   Description: Energy Development Model
Rows: 171      Columns: 3753      Integer: 0      Non-zero: 7506
Eligible rows:      170         IMAX: [illegible]
Conflict count:     3753        U1:   146
Conflict density:   26.13%      U2:   136
Time to find Elig:  .106 sec    U3:   [illegible]

Method   Rows in       Columns in    Time to find     Time in       Number added
         GUB set       GUB set       GUB set (sec.)   Phase 1       in Phase 2
I.2      111           3753          1.38
II.2     111           [illegible]   [illegible]
II.9     111           3753          .920             .912          [missing]
II.10    100           2568          .641             .631          [missing]

Problem: TRUCK                  Description: Fleet Dispatch Model
Rows: 239      Columns: 4752      Integer: 4752      Non-zero: 30074
Eligible rows:      221         IMAX: 171
Conflict count:     10438       U1:   165
Conflict density:   42.94%      U2:   159
Time to find Elig:  .116 sec    U3:   144

Method   Rows in       Columns in    Time to find     Time in       Number added
         GUB set       GUB set       GUB set (sec.)   Phase 1       in Phase 2
I.2      32            1069          6.88
II.2     30            1099          7.095
II.9     30            857           5.00             4.95          [missing]
II.10    [illegible]   986           1.70             1.58          [missing]

Problem: CUPS                   Description: Production Scheduling Model
Rows: 415      Columns: 619      Integer: 145      Non-zero: 1341
Eligible rows:      390         IMAX: 48
Conflict count:     744         U1:   388
Conflict density:   .98%        U2:   374
Time to find Elig:  .042 sec    U3:   294

Method   Rows in       Columns in    Time to find     Time in       Number added
         GUB set       GUB set       GUB set (sec.)   Phase 1       in Phase 2
I.2      213           494           2.96
II.2     214           442           3.15
II.9     214           466           .212             .194          [illegible]
II.10    200           394           .384             .132          [illegible]

A single Phase 2 count, 24, is legible for CUPS; the method to which it belongs cannot be
determined from the source.


Problem: FERT                   Description: Production & Distribution Model
Rows: 606      Columns: 9024      Integer: 0      Non-zero: 40484
Eligible rows:      605         IMAX: 580
Conflict count:     16455       U1:   577
Conflict density:   9.01%       U2:   576
Time to find Elig:  .201 sec    U3:   [illegible]

Method   Rows in       Columns in    Time to find     Time in       Number added
         GUB set       GUB set       GUB set (sec.)   Phase 1       in Phase 2
I.2      [missing]     9024          15.8
II.2     [missing]     9024          10.5
II.9     [missing]     9024          6.73             6.71          0
II.10    [missing]     9024          2.52             2.50          0

Problem: PIES                   Description: Energy Production & Consumption Model
Rows: 663      Columns: 2923      Integer: 0      Non-zero: 13288
Eligible rows:      662         IMAX: 21
Conflict count:     4116        U1:   655
Conflict density:   1.88%       U2:   466
Time to find Elig:  .866 sec    U3:   422

Method   Rows in       Columns in    Time to find     Time in       Number added
         GUB set       GUB set       GUB set (sec.)   Phase 1       in Phase 2
I.2      180           1848          10.8
II.2     169           1693          13.5
II.9     [illegible]   1811          2.82             2.41          1
II.10    177           1761          [illegible]      .788          28

Problem: PAD                    Description: Energy Production & Consumption Model
Rows: 695      Columns: 2934      Integer: 0      Non-zero: 13459
Eligible rows:      694         IMAX: 23
Conflict count:     4416        U1:   687
Conflict density:   1.84%       U2:   502
Time to find Elig:  .104 sec    U3:   449

Method   Rows in       Columns in    Time to find     Time in       Number added
         GUB set       GUB set       GUB set (sec.)   Phase 1       in Phase 2
I.2      200           1864          [illegible]
II.2     189           1771          16.6
II.9     188           1708          3.34             3.26          2
II.10    189           1275          1.35             .928          21








Problem: ELEC                   Description: Energy Production & Consumption Model
Rows: 785      Columns: 2800      Integer: 0      Non-zero: 8462
Eligible rows:      784         IMAX: 22
Conflict count:     6167        U1:   776
Conflict density:   2.01%       U2:   503
Time to find Elig:  .089 sec    U3:   492

Method   Rows in       Columns in    Time to find     Time in       Number added
         GUB set       GUB set       GUB set (sec.)   Phase 1       in Phase 2
I.2      309           2461          11.4
II.2     210           2791          16.1
II.9     309           2641          1.15             1.12          [illegible]
II.10    309           2605          .842             [illegible]   [illegible]

A single Phase 2 count, 14, is legible for ELEC; the method to which it belongs cannot be
determined from the source.

Problem: GAS                    Description: Production Scheduling Model
Rows: 799      Columns: 5536      Integer: 0      Non-zero: 27474
Eligible rows:      789         IMAX: 608
Conflict count:     [illegible] U1:   760
Conflict density:   7.15%       U2:   752
Time to find Elig:  .151 sec    U3:   652

Method   Rows in       Columns in    Time to find     Time in       Number added
         GUB set       GUB set       GUB set (sec.)   Phase 1       in Phase 2
I.2      583           5102          16.2
II.2     639           5536          10.4
II.9     608           5309          3.79             3.77          [missing]
II.10    639           5533          1.47             1.44          [missing]

Problem: FOAM                   Description: Production Scheduling Model
Rows: 1017      Columns: 4020      Integer: 42      Non-zero: 17187
Eligible rows:      1006        IMAX: 261
Conflict count:     8186        U1:   [illegible]
Conflict density:   1.62%       U2:   974
Time to find Elig:  .225 sec    U3:   934

Method   Rows in       Columns in    Time to find     Time in       Number added
         GUB set       GUB set       GUB set (sec.)   Phase 1       in Phase 2
I.2      932           4020          23.4
II.2     932           4020          9.47
II.9     [illegible]   3981          [illegible]      1.71          [missing]
II.10    917           3981          .902             [illegible]   [missing]


Problem  : LANG     Description       : Equipment & Manpower Scheduling Model
Rows     : 1236     Eligible rows     : 1235         IMAX : 184
Columns  : 1425     Conflict count    : 46424        U1   : 1196
Integer  : 0        Conflict density  : 6.09%        U2   : 982
Non-zero : 22028    Time to find Elig : 72 sec       U3   : 973

Method   Rows in   Columns in   Time to find     Time in   Number added
         GUB set   GUB set      GUB set (sec.)   Phase 1   in Phase 2
I.2      382       1207         46.2
II.2     338       908          54.2
II.9     342       923          14.9             14.8      2
II.10    342       922          12.4             1.13      234

Problem  : JCAP     Description       : Production Scheduling Model
Rows     : 2487     Eligible rows     : 2446         IMAX : 488
Columns  : 3849     Conflict count    : 16578        U1   : 2439
Integer  : 560      Conflict density  : .097%        U2   : 2412
Non-zero : 9510     Time to find Elig : .2609 sec    U3   : 1812

Method   Rows in   Columns in   Time to find     Time in   Number added
         GUB set   GUB set      GUB set (sec.)   Phase 1   in Phase 2
I.2      529       2072         104
II.2     5 BY      2186         oe
II.9     529       2087         22238            1.87      5
II.10    523       1393         3.98             1.10      59

Problem  : ODAS     Description       : Manpower Planning Model
Rows     : 4648     Eligible rows     : 4647         IMAX : 4194
Columns  : 4683     Conflict count    : 9220         U1   : 4645
Integer  : 0        Conflict density  : .05%         U2   : 4645
Non-zero : 30520    Time to find Elig : 263 sec      U3   : 4024

Method   Rows in   Columns in   Time to find     Time in   Number added
         GUB set   GUB set      GUB set (sec.)   Phase 1   in Phase 2
I.2      751       3116         369
II.2     721       3846         651
II.9     749       4436         AZ               6.88      0
II.10    751       8020         3.01             2201      2

